# Low VRAM Requirement
- **Deepseek R1 0528 FP4** (nvidia) · MIT · Large Language Model · Safetensors · 372 downloads · 17 likes
  A quantized version of the DeepSeek R1 0528 model from DeepSeek AI, an autoregressive language model based on an optimized Transformer architecture, available for commercial and non-commercial use.
- **Deepseek R1 0528 Quantized.w4a16** (RedHatAI) · MIT · Large Language Model · Safetensors · 126 downloads · 3 likes
  A quantized version of DeepSeek-R1-0528 that significantly reduces GPU memory and disk-space requirements by quantizing the weights to the INT4 data type.
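To make the INT4 idea concrete, here is a minimal round-to-nearest sketch of symmetric weight quantization. This illustrates the general principle only; it is not the actual pipeline used to produce this model, and the function names are hypothetical.

```python
# Illustrative symmetric round-to-nearest INT4 quantization (not the
# actual toolchain behind this model). INT4 holds integers in [-8, 7].

def quantize_int4(weights):
    """Map floats to integers in [-8, 7] with one per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = largest positive INT4 value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

weights = [0.42, -1.73, 0.05, 0.91, -0.33]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
# Each weight now needs 4 bits instead of 16 (FP16), roughly a 4x size
# reduction, at the cost of a small rounding error per weight.
```

The per-weight error is bounded by half the scale, which is why models with well-behaved weight distributions tolerate INT4 storage well.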
- **Wan2.1 VACE 1.3B** (Wan-AI) · Apache-2.0 · Text-to-Video · Supports Multiple Languages · 1,520 downloads · 44 likes
  Wan2.1 is an open and advanced foundational model for video generation, supporting a variety of video generation and editing tasks.
- **Stable Diffusion 3.5 Large DF11** (DFloat11) · Image Generation · 855 downloads · 2 likes
  A losslessly compressed version of stabilityai/stable-diffusion-3.5-large using the DFloat11 format, reducing size by 30% while maintaining 100% accuracy.
- **Qwen3 14B FP8 Dynamic** (RedHatAI) · Apache-2.0 · Large Language Model · Transformers · 167 downloads · 1 like
  Qwen3-14B-FP8-dynamic is an optimized large language model. By quantizing activations and weights to the FP8 data type, it reduces GPU memory requirements and improves computational throughput.
- **Wan2.1 T2V 14B** (wan-community) · Apache-2.0 · Text-to-Video · Supports Multiple Languages · 17 downloads · 0 likes
  Wan2.1 is an open and advanced large-scale video generation model with top-tier performance, capable of running on consumer-grade GPUs and excelling at multitask processing.
- **Deepcoder 14B Preview Exl2** (cgus) · Large Language Model · English · 46 downloads · 2 likes
  DeepCoder-14B-Preview is a code generation model built on DeepSeek-R1-Distill-Qwen-14B, focused on solving verifiable programming problems.
- **Lumina Gguf** (calcuis) · Image Generation · 627 downloads · 11 likes
  A GGUF-quantized version of Lumina, a model designed for high-quality image generation that produces images closely matching their text prompts.
- **Deepseek R1 Distill Qwen 32B Quantized.w8a8** (RedHatAI) · MIT · Large Language Model · Transformers · 3,572 downloads · 11 likes
  A quantized version of DeepSeek-R1-Distill-Qwen-32B that reduces memory requirements and improves computational efficiency through INT8 weight and activation quantization.
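A w8a8 scheme quantizes both the weights and the activations to INT8, so the inner products of a matrix multiply can run in integer arithmetic and only be rescaled to float at the end. A minimal per-tensor sketch of that idea (illustrative only, with hypothetical helper names; real kernels fuse this and use INT32 accumulators in hardware):

```python
# Illustrative symmetric per-tensor INT8 quantization for a w8a8-style
# dot product (not the model's actual kernels). INT8 range: [-127, 127].

def quant_int8(xs):
    """Floats -> (INT8 codes, scale) with symmetric per-tensor scaling."""
    scale = max(abs(x) for x in xs) / 127.0
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

def int8_dot(x, w):
    """Dot product computed in integer arithmetic, rescaled to float."""
    qx, sx = quant_int8(x)  # "dynamic": activation scale found at runtime
    qw, sw = quant_int8(w)  # weight scale is fixed ahead of time in practice
    acc = sum(a * b for a, b in zip(qx, qw))  # INT32 accumulator in real kernels
    return acc * sx * sw

x = [0.5, -1.0, 2.0]
w = [1.5, 0.25, -0.75]
approx = int8_dot(x, w)                    # close to the exact result
exact = sum(a * b for a, b in zip(x, w))   # -1.0
```

Because INT8 has 255 usable levels, the relative error per dot product is small; the memory and bandwidth savings versus FP16 are roughly 2x for both weights and activations.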
- **Deepseek R1 Distill Llama 70B FP8 Dynamic** (RedHatAI) · MIT · Large Language Model · Transformers · 45.77k downloads · 9 likes
  An FP8-quantized version of DeepSeek-R1-Distill-Llama-70B that optimizes inference performance by reducing the bit width of weights and activations.
- **Pixart** (calcuis) · Image Generation · English · 459 downloads · 2 likes
  A quantized version based on PixArt-alpha/PixArt-XL-2-1024-MS, supporting efficient text-to-image tasks.
- **Sd3.5 Medium Gguf** (calcuis) · Other · Image Generation · English · 3,232 downloads · 13 likes
  A GGUF-quantized version of Stable Diffusion 3.5 Medium, suited to text-to-image tasks and capable of running on older devices.
- **Sd3.5 Large Turbo** (calcuis) · Other · Text-to-Image · English · 108 downloads · 5 likes
  A GGUF-quantized version of Stable Diffusion 3.5 Large Turbo, suited to image generation tasks and offering more efficient runtime performance.
- **Hands XL** (xyy1551308532) · Image Generation · 27 downloads · 2 likes
  A text-to-image generation model combining Hands XL, SD 1.5, and FLUX.1-dev technologies, focused on high-quality image generation.
- **Llama 3.1 8B Instruct FP8** (nvidia) · Large Language Model · Transformers · 3,700 downloads · 21 likes
  An FP8-quantized version of the Meta Llama 3.1 8B Instruct model, an autoregressive language model with an optimized Transformer architecture and support for a 128K context length.
- **FLUX.1 Dev Qint4** (Disty0) · Other · Text-to-Image · English · 455 downloads · 12 likes
  FLUX.1-dev is a text-to-image generation model quantized to the INT4 format using Optimum Quanto, suitable for non-commercial use.
- **Meta Llama 3.1 8B Instruct Quantized.w4a16** (RedHatAI) · Large Language Model · Transformers · Supports Multiple Languages · 27.51k downloads · 28 likes
  A quantized version of Meta-Llama-3.1-8B-Instruct, optimized to reduce disk space and GPU memory requirements, suitable for chat-assistant scenarios in English-language business and research.
- **Meta Llama 3.1 8B Instruct GPTQ INT4** (hugging-quants) · Large Language Model · Transformers · Supports Multiple Languages · 128.18k downloads · 25 likes
  An INT4-quantized version of the Meta-Llama-3.1-8B-Instruct model, quantized with the GPTQ algorithm and suitable for multilingual dialogue scenarios.
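GPTQ stores INT4 weights in groups, each group carrying its own scale, which keeps quantization error local. The GPTQ algorithm itself chooses the quantized values to minimize each layer's output reconstruction error; the grouped storage layout alone can be sketched with plain round-to-nearest (hypothetical helper names, illustration only):

```python
# Illustrative group-wise INT4 layout as used by GPTQ-style checkpoints.
# GPTQ additionally solves for codes that minimize layer output error;
# this sketch uses simple round-to-nearest to show the format only.

def groupwise_int4(weights, group_size=4):
    """Quantize weights in groups, one scale per group. Returns
    a list of (int4_codes, scale) pairs."""
    groups = []
    for i in range(0, len(weights), group_size):
        g = weights[i:i + group_size]
        scale = (max(abs(w) for w in g) / 7.0) or 1.0  # avoid zero scale
        q = [max(-8, min(7, round(w / scale))) for w in g]
        groups.append((q, scale))
    return groups

def dequant(groups):
    """Flatten the grouped codes back to approximate float weights."""
    return [v * s for q, s in groups for v in q]

# A small group (e.g. 4 here; 128 is a common real-world choice) lets
# each scale adapt to its local weight range, so an outlier in one
# group does not blow up the error everywhere else.
ws = [0.1, -0.2, 0.05, 0.15, 3.0, -2.0, 1.0, 0.5]
approx = dequant(groupwise_int4(ws, group_size=4))
```

Note how the second group's large values (around 3.0) get their own coarse scale without degrading the precision of the small first group.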
- **Deepseek Coder V2 Lite Instruct FP8** (RedHatAI) · Other · Large Language Model · Transformers · 11.29k downloads · 7 likes
  An FP8-quantized version of DeepSeek-Coder-V2-Lite-Instruct, optimized for inference efficiency and suitable for commercial and research use in English.
- **Mapo Beta** (mapo-t2i) · Text-to-Image · 30 downloads · 6 likes
  MaPO is a reference-free, energy-efficient, and memory-friendly alignment method for text-to-image diffusion models.
- **Koala Lightning 700m** (etri-vilab) · Image Generation · 170 downloads · 6 likes
  KOALA-Lightning-700M is an efficient text-to-image model trained via knowledge distillation from SDXL-Lightning, significantly improving inference speed while maintaining generation quality.
- **Llama 2 13B Fp16 French** (Nekochu) · Apache-2.0 · Large Language Model · Supports Multiple Languages · 79 downloads · 11 likes
  A French Q&A model fine-tuned from Llama-2-13b-chat, supporting tasks such as Baroque-style text generation.